马尔可夫链蒙特卡洛方法用于从复杂分布和估计归一化常数采样的方法,通常会模拟沿着退火路径的一系列中间分布的样品,该路径桥梁在可缝隙的初始分布和目标密度之间桥接。先前的工作已经使用准算术手段构建了退火路径,并将所得的中间密度解释为最小化对终点的预期差异。我们在单调的密度函数嵌入下使用布雷格曼的分歧对这种“质心”属性进行了全面分析,从而将诸如Amari和Renyi的$ {\ alpha} $ - divergences等共同差异相关联,$ {(\ alpha,\ beta) } $ - 分歧,以及沿着退火路径的中间密度的詹森 - 香农脱落。我们的分析强调了使用Zhang 2004的Rho-Tau Bregman Divergence框架; 2013年的Rho-Tau Bregman Divergence框架之间的参数族之间的相互作用和分歧函数。
translated by 谷歌翻译
诸如最大熵正则化之类的政策正则化方法被广泛用于增强学习以提高学习政策的鲁棒性。在本文中,我们展示了这种鲁棒性是如何通过对冲的奖励功能扰动而产生的,奖励功能是从想象中的对手设定的限制设置中选择的。使用凸双重性,我们表征了KL和Alpha-Divergence正则化的一组强大的对抗奖励扰动集,其中包括香农和Tsallis熵正则定期为特殊情况。重要的是,可以在此强大集合中给出概括保证。我们提供了有关最坏的奖励扰动的详细讨论,并提供了直观的经验示例,以说明这种稳健性及其与概括的关系。最后,我们讨论我们的分析如何补充并扩展对对抗奖励鲁棒性和路径一致性最佳条件的先前结果。
translated by 谷歌翻译
我们扩展了时间差异(TD)学习,以获得风险敏感的无模型加强学习算法。该扩展可以被视为Rescorla-Wagner规则的修改,其中(六样)刺激被认为是过度或低估TD目标的事件。结果,获得从I.I.D的自由能量的随机近似规则。通过高斯分布产生的样本,具有未知的平均值和方差。由于已知高斯自由能量是对平均值和方差的确定性相当敏感,因此学习规则具有风险敏感决策的应用。
translated by 谷歌翻译
Crop type maps are critical for tracking agricultural land use and estimating crop production. Remote sensing has proven an efficient and reliable tool for creating these maps in regions with abundant ground labels for model training, yet these labels remain difficult to obtain in many regions and years. NASA's Global Ecosystem Dynamics Investigation (GEDI) spaceborne lidar instrument, originally designed for forest monitoring, has shown promise for distinguishing tall and short crops. In the current study, we leverage GEDI to develop wall-to-wall maps of short vs tall crops on a global scale at 10 m resolution for 2019-2021. Specifically, we show that (1) GEDI returns can reliably be classified into tall and short crops after removing shots with extreme view angles or topographic slope, (2) the frequency of tall crops over time can be used to identify months when tall crops are at their peak height, and (3) GEDI shots in these months can then be used to train random forest models that use Sentinel-2 time series to accurately predict short vs. tall crops. Independent reference data from around the world are then used to evaluate these GEDI-S2 maps. We find that GEDI-S2 performed nearly as well as models trained on thousands of local reference training points, with accuracies of at least 87% and often above 90% throughout the Americas, Europe, and East Asia. Systematic underestimation of tall crop area was observed in regions where crops frequently exhibit low biomass, namely Africa and South Asia, and further work is needed in these systems. Although the GEDI-S2 approach only differentiates tall from short crops, in many landscapes this distinction goes a long way toward mapping the main individual crop types. The combination of GEDI and Sentinel-2 thus presents a very promising path towards global crop mapping with minimal reliance on ground data.
translated by 谷歌翻译
Question Answering (QA) is a growing area of research, often used to facilitate the extraction of information from within documents. State-of-the-art QA models are usually pre-trained on domain-general corpora like Wikipedia and thus tend to struggle on out-of-domain documents without fine-tuning. We demonstrate that synthetic domain-specific datasets can be generated easily using domain-general models, while still providing significant improvements to QA performance. We present two new tools for this task: A flexible pipeline for validating the synthetic QA data and training downstream models on it, and an online interface to facilitate human annotation of this generated data. Using this interface, crowdworkers labelled 1117 synthetic QA pairs, which we then used to fine-tune downstream models and improve domain-specific QA performance by 8.75 F1.
translated by 谷歌翻译
The need for AI systems to provide explanations for their behaviour is now widely recognised as key to their adoption. In this paper, we examine the problem of trustworthy AI and explore what delivering this means in practice, with a focus on healthcare applications. Work in this area typically treats trustworthy AI as a problem of Human-Computer Interaction involving the individual user and an AI system. However, we argue here that this overlooks the important part played by organisational accountability in how people reason about and trust AI in socio-technical settings. To illustrate the importance of organisational accountability, we present findings from ethnographic studies of breast cancer screening and cancer treatment planning in multidisciplinary team meetings to show how participants made themselves accountable both to each other and to the organisations of which they are members. We use these findings to enrich existing understandings of the requirements for trustworthy AI and to outline some candidate solutions to the problems of making AI accountable both to individual users and organisationally. We conclude by outlining the implications of this for future work on the development of trustworthy AI, including ways in which our proposed solutions may be re-used in different application settings.
translated by 谷歌翻译
Opinion summarisation synthesises opinions expressed in a group of documents discussing the same topic to produce a single summary. Recent work has looked at opinion summarisation of clusters of social media posts. Such posts are noisy and have unpredictable structure, posing additional challenges for the construction of the summary distribution and the preservation of meaning compared to online reviews, which has been so far the focus of opinion summarisation. To address these challenges we present \textit{WassOS}, an unsupervised abstractive summarization model which makes use of the Wasserstein distance. A Variational Autoencoder is used to get the distribution of documents/posts, and the distributions are disentangled into separate semantic and syntactic spaces. The summary distribution is obtained using the Wasserstein barycenter of the semantic and syntactic distributions. A latent variable sampled from the summary distribution is fed into a GRU decoder with a transformer layer to produce the final summary. Our experiments on multiple datasets including Twitter clusters, Reddit threads, and reviews show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries with respect to meaning preservation according to human evaluations.
translated by 谷歌翻译
This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments had a variety of challenges in areas such as evaluation, learning from offline data, and constraint satisfaction. Our paper describes these challenges in the hope that awareness of them will benefit future applied RL work. We also describe the way we adapted our RL system to deal with these challenges, resulting in energy savings of approximately 9% and 13% respectively at the two live experiment sites.
translated by 谷歌翻译
This paper presents a 1-D convolutional graph neural network for fault detection in microgrids. The combination of 1-D convolutional neural networks (1D-CNN) and graph convolutional networks (GCN) helps extract both spatial-temporal correlations from the voltage measurements in microgrids. The fault detection scheme includes fault event detection, fault type and phase classification, and fault location. There are five neural network model training to handle these tasks. Transfer learning and fine-tuning are applied to reduce training efforts. The combined recurrent graph convolutional neural networks (1D-CGCN) is compared with the traditional ANN structure on the Potsdam 13-bus microgrid dataset. The achievable accuracy of 99.27%, 98.1%, 98.75%, and 95.6% for fault detection, fault type classification, fault phase identification, and fault location respectively.
translated by 谷歌翻译
从职位发布获得的汇总数据为劳动力市场需求,新兴技能以及援助工作匹配提供了有力的见解。但是,大多数提取方法受到监督,因此需要昂贵且耗时的注释。为了克服这一点,我们建议通过弱监督提取技巧。我们利用欧洲的技能,能力,资格和职业分类法,通过潜在代表来找到工作广告的类似技能。该方法根据令牌级别和句法模式显示了强烈的正信号,优于基准。
translated by 谷歌翻译